Stylometric features for emotion level classification in news related blogs
Authors
Abstract
Breaking news and events are often posted in the blogosphere before they are published by any media agency. Therefore, the blogosphere is a valuable resource for news-related blog analysis. However, it is crucial to first sort out news-unrelated content such as personal diaries or advertising blogs. In addition, blogs differ in their level of emotionality or involvement, which biases the news information to a certain extent. In our work, we evaluate topic-independent stylometric features to classify blogs into news versus rest and to assess the emotionality in these blogs. We apply several text classifiers to determine the best-performing combination of features and algorithms. Our experiments revealed that with simple style features, blogs can be classified into news versus rest and their emotionality can be assessed with accuracy values of almost 80%.
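To make "topic-independent stylometric features" concrete, here is a minimal Python sketch of how a few simple style features could be computed for one blog post. The abstract does not list the features the authors used, so the feature names, the `FIRST_PERSON` word list, and the `style_features` helper are all assumptions chosen for illustration. They describe how a post is written rather than what it is about.

```python
import re

# Hypothetical first-person word list (assumption, not from the source).
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}


def style_features(text):
    """Return a dict of simple, topic-independent style features for one post."""
    # Words are runs of letters and apostrophes; sentences are split on . ! ?
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Guard against division by zero on empty input.
    n_words = max(len(words), 1)
    n_chars = max(len(text), 1)

    return {
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "exclamation_ratio": text.count("!") / n_chars,
        "question_ratio": text.count("?") / n_chars,
        "first_person_ratio": sum(w.lower() in FIRST_PERSON for w in words) / n_words,
        # Fully capitalised words longer than one letter (e.g. "AWFUL", not "I").
        "uppercase_word_ratio": sum(w.isupper() and len(w) > 1 for w in words) / n_words,
    }


if __name__ == "__main__":
    news = "The ministry confirmed the budget on Monday. Officials expect a vote next week."
    diary = "I can't believe it! My day was AWFUL!! Why me?"
    print(style_features(news))
    print(style_features(diary))
```

Feature vectors of this kind could then be passed to any standard text classifier. Which features and which classifier work best is an empirical question, and it is the one the abstract reports investigating.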
Similar resources
Stylometric Analysis of Bloggers' Age and Gender
We report results of stylometric differences in blogging for gender and age group variation. The results are based on two mutually independent features. The first feature is the use of slang words which is a new concept proposed by us for Stylometric study of bloggers. Slang is a non-dictionary word that has evolved with time due to its frequent and popular usage. For the second feature, we hav...
Facet Classification of Blogs: Know-Center at the TREC 2009 Blog Distillation Task
In this paper, we outline our experiments carried out at the TREC 2009 Blog Distillation Task. Our system is based on a plain text index extracted from the XML feeds of the TREC Blogs08 dataset. This index was used to retrieve candidate blogs for the given topics. The resulting blogs were classified using a Support Vector Machine that was trained on a manually labelled subset of the TREC Blogs0...
Event Based Emotion Classification for News Articles
Reading news articles can trigger emotional reactions in readers. But compared to other genres of text, news articles, which are mainly used to report events, lack emotion-linked words and other features for emotion classification. In this paper, we propose an event anchor based method for emotion classification for news articles. Firstly, we build an emotion linked news corpus through c...
A New Document Embedding Method for News Classification
Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. A great deal of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
Using Roget's Thesaurus for Fine-grained Emotion Recognition
Recognizing the emotive meaning of text can add another dimension to the understanding of text. We study the task of automatically categorizing sentences in a text into Ekman’s six basic emotion categories. We experiment with corpus-based features as well as features derived from two emotion lexicons. One lexicon is automatically built using the classification system of Roget’s Thesaurus, while...